Collection of Simultaneous Interpreting Patterns by Using Bilingual Spoken Monologue Corpus

نویسندگان

  • Hitomi Tohyama
  • Shigeki Matsubara
چکیده

This paper provides an investigation of simultaneous interpreting patterns using a bilingual spoken monologue corpus. 4,578 pairs of English-Japanese aligned utterances in CIAIR simultaneous interpretation database were used. This investigation is the largest scale as the observation of simultaneous interpreting speech. The simultaneous interpreters are required to generate the target speech simultaneously with the source speech. Therefore, they have various kinds of strategies to raise simultaneity. In this investigation, the simultaneous interpreting patterns with high frequency and high flexibility were extracted from the corpus. As a result, we collected 203 cases out of aligned utterances in which simultaneous interpreters’ strategies for raising simultaneity were observed. These 203 cases could be categorized into 12 types of interpreting pattern. It was clarified that 4.5 percent of the English-Japanese monologue data were fitted in those interpreting patterns. These interpreting patterns can be expected to be used as interpreting rules of simultaneous machine interpretation.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Bilingual Spoken Monologue Corpus for Simultaneous Machine Interpretation Research

Abstract This paper describes a large-scale bilingual corpus of spoken monologues and their simultaneous interpretation, which has been constructed at CIAIR. The corpus has the following characteristics: (1) English and Japanese speeches are recorded in parallel, (2) the data contains monologue speeches such as lecture and self-introduction, and (3) the exact beginning and ending times are prov...

متن کامل

Construction of Chunk-Aligned Bilingual Lecture Corpus for Simultaneous Machine Translation

Abstract With the development of speech and language processing, speech translation systems have been developed. These studies target spoken dialogues, and employ consecutive interpretation, which uses a sentence as the translation unit. On the other hand, there exist a few researches about simultaneous interpreting, and recently, the language resources for promoting simultaneous interpreting r...

متن کامل

Interpreting Unit Segmentation of Conversational Speech in Simultaneous Interpretation Corpus

The speech-to-speech translation system is becoming an important research topic with the progress of the speech and language processing technology. Considering efficiency and the smoothness of the cross-lingual conversation, the simultaneity of the translation processing has a great influence on the performance of the system. This paper describes interpreting unit segmentation of conversational...

متن کامل

Incremental dependency parsing of Japanese spoken monologue based on clause boundaries

In applications of spoken monologue processing such as simultaneous machine interpretation and real-time captions generation, incremental language parsing is strongly required. This paper proposes a technique for incremental dependency parsing of Japanese spoken monologue on a clause-by-clause basis. The technique identifies the clauses based on clause boundaries analysis, analyzes the dependen...

متن کامل

Toward a Broad-coverage Bilingual Corpus for Speech Translation of Travel Conversations in the Real World

Abstract At ATR Spoken Language Translation Research Laboratories, we are building a broad-coverage bilingual corpus to study corpus-based speech translation technologies for the real world. There are three important points to consider in designing and constructing a corpus for future speech translation research. The first is to have a variety of speech samples, with a wide range of pronunciati...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2006